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Abstract 

We  consider  an  approach  to  testing  that 
combines  white-box  and  black-box  techniques. 
Black-box  testing  is  used  for  testing  a  program’s 
effects  against  its  specification.  White-box  test¬ 
ing  is  essential  if  subtle  implementation  errors 
are  to  be  identified,  e.g.,  errors  due  to  race 
conditions.  Full  white-box  testing  is  a  large 
task.  However,  for  many  properties,  only  a 
small  portion  of  the  program  is  relevant  —  hence 
property-based  testing  has  the  potential  to  sub¬ 
stantially  simplify  much  of  the  testing  work.  The 
portion  of  a  program  that  relates  to  a  given  prop¬ 
erty  can  be  identified  through  slicing.  We  de¬ 
scribe  the  ongoing  development  of  a  Tester’s  As¬ 
sistant,  which  in  the  long  term,  will  include  a 
specification-driven  slicer  for  C  programs,  a  test 
data  generator,  a  coverage  analyzer,  and  an  exe¬ 
cution  monitor.  The  slicer  and  execution  moni¬ 
tor  are  described  in  this  paper,  and  applications 
to  Unix  security  are  indicated.  Security  is  an  im¬ 
portant  application  of  property-based  testing  be¬ 
cause  of  the  subtle  undetected  security  errors  in 
delivered  operating  systems.  It  is  also  a  promis¬ 
ing  application  because  of  the  (unexpectedly) 
concise  specifications  that  capture  most  security 
requirements,  and  because  of  the  operating  sys¬ 
tem  support  for  execution  monitoring. 
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1  Introduction 

White-box  testing  involves  the  generation  of  test 
data  based  on  program  code.  Black-box  testing, 
traditionally,  has  involved  the  generation  of  test 
data  based  on  program  specifications.  White- 
box  testing  is  essential  if  subtle  errors  are  sus¬ 
pected,  especially  those  based  on  race  conditions, 
internal  program  state,  or  ’’windows”  where  the 
program  is  exposed/vulnerable  to  actions  that 
could  cause  failure.  Block-box  testing,  on  the 
other  hand,  helps  focus  testing  activities  on  the 
expected  behavior  of  the  program.  The  general 
opinion  of  the  testing  community  is  that  the  judi¬ 
cious  combinations  of  these  techniques  is  needed. 
This  paper  explores  the  combinations  of  black¬ 
box  and  white-box  testing,  but  employs  specifi¬ 
cations,  the  basis  for  black-box  testing,  to  slice 
a  program  to  yield  the  subset  of  the  program 
relevant  to  the  properties  required  by  the  speci¬ 
fications.  We  call  this  technique  property-based 
testing. 

Not  surprisingly,  the  program  properties 
which  are  to  be  to  the  focus  of  the  testing  ef¬ 
fort  depends  on  the  applications;  furthermore,  a 
given  application  might  involve  numerous  prop¬ 
erties. 

For  software  intended  to  implement  a  secure 
operating  system,  for  example,  the  critical  prop¬ 
erties  are  those  that  related  to  preservation  of  the 
security  state  of  the  system  to  the  unauthorized 
release  of  information,  or  to  the  preservation  of 
the  integrity  of  hies.  The  portion  of  the  pro¬ 
gram  which  addresses  these  critical  properties  is 
often  significantly  smaller  than  the  whole  pro- 
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gram;  therefore  property-based  testing  is  a  more 
efficient  testing  scheme  than  classical  white-box 
or  black-box  methods. 

Specifications  are  necessary  in  order  to  de¬ 
scribe  the  properties  which  drive  the  testing  pro¬ 
cess.  We  have  found  that  it  is  relatively  easy  to 
specify  most  security-related  properties.  It  is  not 
necessary,  in  general,  to  provide  a  full  specifica¬ 
tion  of  the  program’s  behavior,  as  in  the  work  de¬ 
scribed  in  [OT89],  and  as  we  demonstrate  below 
for  security  specifications.  In  some  instances,  no 
specification  of  the  program  itself  is  necessary  for 
detecting  security-related  flaws.  Certain  generic 
flaws  [Spa92]  been  shown  to  be  the  cause  of  many 
security  problems.  These  generic  flaws  can  be 
captured  in  specifications  for  use  in  property- 
based  testing.  By  not  requiring  full  specifica¬ 
tions,  property-based  testing  is  relatively  easy 
to  apply  to  systems  that  do  not  typically  come 
with  full  specifications. 

There  are  three  main  uses  of  specifications  in 
testing.  The  first,  slicing,  identifies  the  subset 
of  the  program  that  governs  behavior  relating 
to  a  specification.  The  second,  execution  mon¬ 
itoring,  uses  a  specification  to  construct  oracles 


which  detect  when  an  execution  exhibits  behav¬ 
ior  violating  the  specification.  The  third,  the 
use  of  specifications  for  test  data  generation,  the 
traditional  use  of  specifications  in  testing,  is  not 
emphasized  in  properly  based  testing,  as  the  role 
of  a  specification  in  this  case  is  mostly  subsumed 
by  its  role  in  slicing. 

Specifications  are  useful  for  all  these  purposes, 
whether  or  not  the  specifications  are  written  in  a 
formal  language.  However,  requiring  the  partial 
specifications  to  be  written  in  a  formal  language 
enables  the  use  of  automatic  tools  to  process  the 
specification. 

We  are  developing  an  environment,  the 
Tester’s  Assistant  (See  Figure  1)  for  property- 
based  testing,  to  demonstrate  the  usefulness  of 
the  methodology.  The  Testers  Assistant  has 
modules  for  slicing,  dataflow  testing,  test  data 
generation,  and  execution  monitoring1.  The  pri¬ 
mary  examples  to  which  the  Tester’s  Assistant  is 

1A  side  effect  of  the  use  of  the  Tester’s  Assistant  is 
a  test  suite  that  can  later  be  used  for  regression  testing. 
Through  coverage  monitoring,  test  data  can  explicitly  be 
associated  with  paths,  which  can  also  aid  in  regression 
testing 


being  applied  are  UNIX  utility  programs,  which 
are  ah  written  in  C  (e.g.,  login  and  rdist),  all  of 
which  are  security  critical.  Our  techniques  redis¬ 
covered  the  known  security  flaws  in  several  util¬ 
ity  programs,  including  Rdist;  to  date  we  have 
discovered  no  new  security  flaws  in  UNIX  al¬ 
though  the  testing  of  other  programs  is  ongo¬ 
ing.  In  addition  to  testing  sequential  programs, 
the  Tester’s  Assistant  is  being  designed  to  ad¬ 
dress  concurrent  and  distributed  systems  as  well. 
Many  of  the  known  security  flaws  appear  in  con¬ 
current  or  distributed  programs  on  multi-tasking 
UNIX  machines. 

Section  2  outlines  the  security  model  used  in 
property  specifications.  Section  3  defines  slicing 
and  discusses  the  issues  associated  with  it.  Sec¬ 
tion  4  describes  how  specifications  can  be  used 
to  monitor  program  execution.  Section  5  applies 
the  Tester’s  Assistant  methodology  to  several 
example  programs.  Section  6  sketches  out  the 
Tester’s  Assistant  implementation  and  reports 
on  current  progress.  Finally,  Section  7  spells  out 
some  conclusions  and  future  plans. 

2  Security  Specifications 

The  Tester’s  Assistant  is  heavily  reliant  upon 
specifications  and,  in  particular,  an  effective 
model  for  security  specifications.  In  this  section, 
a  model  for  UNIX  security  is  introduced,  and  ex¬ 
amples  of  the  various  kinds  of  specifications  are 
given. 

In  UNIX,  some  programs  have  privileges  be¬ 
yond  those  needed  to  complete  their  job.  (Fig¬ 
ure  2)  This  happens  because  the  granularity  of 
the  access  control  mechanism  may  not  be  fine 
enough.  To  determine  that  these  programs  do 
not  indeed  exceed  their  intended  privilege,  we 
test  them  with  respect  to  their  security  specifica¬ 
tions,  which  we  provide.  For  example,  in  UNIX, 
the  “/bin/passwd”  program  is  a  setuid  root  pro¬ 
gram,  which  means  the  program  will  run  with 
superuser  privileges  no  matter  who  executes  the 
program.  Therefore,  /bin/passwd  can  poten¬ 
tially  do  anything  in  a  UNIX  system  (e.g.,  plant 
a  trojan  horse,  kill  any  process,  shut  down  the 
system).  We  want  to  assure  that  the  program  is 


Figure  2:  Security  requirements  distinguish  pro¬ 
grams  which  behave  safely  from  those  who  do 
not.  For  some  tasks,  it  is  necessary  to  breach  a 
security  requirement;  these  tasks  are  assigned  to 
“trusted”  programs  which  can  transcend  the  se¬ 
curity  requirements  arbitrarily,  but  are  expected 
not  to  exhibit  unsafe  behavior. 


appropriately  restricted  in  wliat  it  can  do  in  the 
system.  Conversely,  we  want  to  ensure  that  cru¬ 
cial  activities  such  as  user  authentications  take 
place.  Without  specifications,  there  is  no  assur¬ 
ance  that  the  absence  of  authentication  routines 
will  be  noticed  during  the  testing  process. 

A  motivating  factor  behind  the  model  is  the 
desire  to  require  only  minimal  specifications  to 
be  provided.  There  are  several  reasons  for  this: 
specifications  are  hard  to  write;  if  large,  they 
are  hard  to  understand;  it  is  difficult  to  retrofit 
specifications  onto  pre-existing  code;  finally,  it  is 
desirable  to  test  programs  written  by  others,  for 
which  you  do  not  possess  full  specifications. 

Our  model  divides  security  specifications  into 
three  categories,  plus  an  additional  category  of 
safety  specifications;  these  categories  are  not  im¬ 
mutable,  but  they  provide  a  convenient  frame¬ 
work  from  which  to  define  the  specifications. 
The  primitives  of  the  specifications  are  users,  ob¬ 
jects,  and  access  rights  and  restrictions* 

The  first  category  describes  general  system  re¬ 
quirements.  These  are  properties  expressed  as 


predicates  which,  in  general,  programs  should 
not  violate.  An  example  of  a  system  requirement 
is  that  the  hie  that  stores  password  information 
is  not  accessed,  as  this  would  be  a  breach  of 
security2.  However,  it  is  obvious  that  in  some  in¬ 
stances  this  requirement  must  be  violated;  when 
some  program  needs  to  perform  an  authentica¬ 
tion  of  a  user,  a  check  is  necessary  against  the 
information  in  the  password  hie.  The  system 
requirements  form  a  strong  barrier  between  pro¬ 
grams  and  the  system. 

The  second  category  of  specihcations  dehnes 
the  interface  between  programs  and  the  system. 
Within  the  context  of  the  interface,  programs  are 
allowed  to  violate  hrst  category  specihcations. 
Each  system  call  that  is  related  to  security  has  a 
security-related  specihcation.  Some  system  calls 
have  pre-conditions:  predicates  that  must  be  sat¬ 
isfied  if  the  system  call  is  to  be  used  in  a  secure 
manner.  For  example,  the  setuid  system  call, 
which  gives  access  permission  to  a  process,  re¬ 
quires  that  the  user  be  authenticated.  This  as¬ 
sertion  results  in  the  specihcation  shown  below: 

•  pre-condition:  authenticated(uid) 

•  setuid(uid) 

•  post-condition: 

permission-granted(uid) 

This  specihcation  says  that  the  user  should 
be  authenticated  before  the  setuid  system  call 
is  made,  and  that  after  the  system  call  is  made, 
permissions  have  been  granted  for  that  user.  The 
post-condition  is  information  that  is  added  to 
the  security  state  after  execution,  and  can  help 
validate  (or  invalidate)  other  requirements  and 
pre-conditions. 

Library  routines  are  also  dealt  with  in  this 
manner.  Although  it  may  be  possible  to  break 
down  library  routines  into  their  component  com¬ 
putations  and  system  calls,  it  is  more  efficient  to 
assume  that  the  standard  libraries  operate  cor¬ 
rectly,  and  to  specify  the  library  functions  in  the 

2In  UNIX,  note  that  this  hie  (/etc/passwd)  is  in  fact 
globally  readable;  other  information  kept  in  the  hie  needs 
to  be  generally  available.  This  makes  it  more  difficult  to 
detect  security  breaches  related  to  the  password  hie,  as 
the  UNIX  security  policy  allows  read  access. 


same  way  as  system  calls  are  specified.  The  cor¬ 
rectness  of  the  libraries  can  be  established  in  an 
independent  test  run.  An  important  set  of  li¬ 
brary  routines  to  define  for  the  security  model 
is  the  password  library,  a  set  of  routines  which 
define  a  standard  interface  to  the  password  hie. 

The  third  category  is  program-specific  specih¬ 
cations.  Such  a  specihcation  details  when  a  pro¬ 
gram  is  allowed  to  circumvent  the  restrictions  of 
the  hrst  and  second  category  of  specihcations. 
For  example,  to  allow  users  to  change  their  pass¬ 
words  or  new  accounts  to  be  created,  special 
permission  must  be  given  in  order  to  write  to 
the  /etc/passwd  hie.  The  program  to  change 
passwords,  /bin/passwd,  is  given  a  specihcation 
which  exactly  specihes  the  scope  of  the  changes 
it  can  make. 

passwd(U:uid) 

write (password_f ile .U. password) 

Informally,  the  specihcation  says  that  when  a 
user  executes  the  “passwd”  program,  that  in¬ 
stance  program  can  write  only  to  the  password 
hie  and  only  the  entry  corresponding  to  the  pass¬ 
word  of  that  user. 

The  hnal  category  of  specihcation  describes 
safety  properties,  which  have  been  enumerated 
many  times  [Lut93]  [Spa92].  Safety  proper¬ 
ties  cover  common  programming  mistakes  which 
can  cause  haws.  These  properties  are  necessary 
to  handle  the  well-behavedness  property,  de¬ 
scribed  in  the  next  section. 

3  Slicing  for  Security  Proper¬ 
ties 

Slicing  is  an  abstraction  mechanism  in  which 
code  that  might  inhuence  the  value  of  a  given 
variable  or  set  of  variables  is  extracted  from  the 
full  source  code  of  a  program.  Weiser  [Wei84] 
originally  implemented  slicing  for  FORTRAN 
programs.  A  criterion  for  the  slice  is  selected. 
In  the  simplest  scenario,  a  criterion  is  a  vari¬ 
able  and  a  location  in  the  program.  There  are 
two  essential  characteristics  that  are  required  of 
a  slice  with  respect  to  such  a  criterion:  it  must 
be  executable,  and  for  the  same  input  values,  the 


variable  must  have  identical  values  at  the  cor¬ 
responding  location  in  the  slice  and  the  original 
program.  More  complex  criteria  can  involve  mul¬ 
tiple  variables,  multiple  locations,  and  include 
the  behavior  of  the  program  after  arriving  at  the 
location. 

A  slice  of  a  program  is  produced  by  generat¬ 
ing  control  and  data  flow  graphs  of  the  program, 
then  finding  a  closure  with  respect  to  the  “de¬ 
pends  upon”  property,  starting  with  the  nodes 
that  correspond  to  the  slicing  criteria.  For  sim¬ 
ple  imperative  languages,  generation  of  the  flow 
graphs  and  the  resultant  slice  is  relatively  easy. 
More  complicated  language  constructs  such  as 
procedures,  gotos,  and  pointers  require  alter¬ 
ations  in  the  slicing  algorithm.  Interprocedural 
slicing  [LC92]  and  slicing  for  programs  with  arbi¬ 
trary  control  flow  [BH92]  have  been  studied.  In 
order  to  slice  C  code,  the  problems  of  pointers 
and  pointer  aliasing  need  to  be  addressed. 

Weiser  noted  that  slices  are  implicitly  used  in 
debugging  [Wei82].  In  the  Tester’s  Assistant,  the 
uses  of  slices  are  made  explicit.  Slicing  impacts 
the  other  testing  algorithms  (e.g.,  coverage)  by 
enabling  an  algorithm  to  be  utilized  on  a  sub¬ 
set  of  the  original  program  (the  slice),  giving  an 
increase  in  efficiency.  For  example,  given  that  a 
slice  possesses  the  two  essential  characteristics, 
code  instrumentation  in  order  to  measure  cov¬ 
erage  with  respect  to  a  specification  need  only 
be  added  to  the  subprogram  designated  by  the 
slice.  Alternatively,  the  slice  could  be  compiled 
and  executed  in  isolation  from  the  rest  of  the 
program. 

Implicitly,  by  using  slices  as  the  base  unit  for 
testing  programs  for  security  holes,  we  are  as¬ 
suming  that  testing  a  slice  is  equivalent  to  testing 
the  whole  program.  The  equivalency  of  a  slice  to 
a  program  with  respect  to  some  narrow  crite¬ 
rion  does  not  necessarily  imply  the  equivalency 
of  test  results,  although  in  the  Tester’s  Assistant 
we  have  attempted  to  bridge  the  gap  by  relying 
on  the  completeness  of  the  model  defined  by  the 
security  specifications  to  describe  the  properties 
of  interest.  For  properties  that  have  not  been 
defined,  e.g.,  in  our  model,  properties  related  to 
program  correctness  (rather  than  security),  test¬ 
ing  program  slices  will  give  very  little  informa¬ 


tion.  A  test  of  a  (correct)  slice  that  was  created 
with  respect  to  a  specification  tests  that  specifi¬ 
cation,  but  not  necessarily  anything  else. 

An  aspect  of  a  program  which  potentially  jeop¬ 
ardizes  the  correctness  of  slices  is  unexpected 
side-effects  which  make  the  flow  graphs  very  dif¬ 
ficult  or  impossible  to  calculate.  This  difficulty 
arises  when  slicing  C  programs  because  the  lan¬ 
guage  allows  virtually  unlimited  pointer  arith¬ 
metic  and  also  allows  many  forms  of  pointer  (and 
function)  aliasing. 

To  address  this,  we  make  the  powerful  as¬ 
sumption  that  pointers  are  well-behaved.  A 
pointer  is  well-behaved  if,  once  assigned  to  an 
object  in  memory,  it  does  not,  through  pointer 
addition  or  typecasting,  try  to  refer  to  a  dif¬ 
ferent  object  in  memory.  A  program  is  well- 
behaved  if  all  of  its  pointers  are  well-behaved. 
Lo  [Lo92]  showed  that  static  analysis  can  es¬ 
tablish  the  well-behavedness  property  in  many 
cases.  The  well-behavedness  property  can  also 
be  affirmed  through  testing  with  respect  to  ap¬ 
propriate  safety  specifications.  Unfortunately, 
much  of  the  power  of  C  as  a  language  derives 
from  being  able  to  do  ill-behaved  operations. 
However,  in  many  cases,  this  ill-behavedness  is 
used  in  very  specific  ways  in  which  it  is  pos¬ 
sible  to  deduce  an  underlying  well-behavedness 
property.  For  example,  the  use  of  void  point¬ 
ers  and  type  casting  to  implement  generic  data 
structures  and  the  use  of  pointer  arithmetic  for 
array  indexing  are  both  classic  ways  in  which 
what  appears  to  be  ill-behaved  code  is  actually 
well-behaved. 

Using  a  slice  as  an  intermediary  between  a 
specification  and  coverage  monitoring  and  test 
data  generation  changes  the  nature  of  the  latter 
two  tasks.  Using  a  specification  to  generate  a 
slice  changes  the  way  it  is  used  to  generate  test 
data.  Because  slices  tend  to  be  small,  it  is  feasi¬ 
ble  to  use  symbolic  evaluation  to  create  test  data 
for  most  paths  in  a  slice.  This  is  an  indirect  use 
of  a  specification  in  test  data  generation.  Spec¬ 
ifications  are  also  used  in  test  data  generation 
to  identify  partitions  and  interesting  data  values 
(e.g.  /etc/passwd  for  filenames)  in  the  input 
domains  from  which  test  data  should  be  drawn. 
However,  by  definition,  a  slice  contains  all  code 


relevant  to  a  given  property,  so  coverage  mea¬ 
sures  are  only  necessary  for  paths  contained  in 
the  slice. 

Slicing  typically  is  performed  to  examine  the 
behavior  of  a  set  of  variables  at  a  specific  point 
in  the  program.  When  we  slice  for  properties, 
the  slicing  criteria  become  more  complex.  In  the 
simpler  cases  this  involves  the  values  of  variables 
at  different  points  in  the  program  in  a  cumula¬ 
tive  way,  i.e. ,  the  union  of  the  individual  slices. 
In  other  cases,  the  desired  slice  is  the  intersec¬ 
tion  of  slices,  or  is  decided  by  a  more  complex 
boolean  formula.  For  example,  when  attempting 
to  slice  with  respect  to  the  setuid  specification 
given  above,  the  slice  will  contain  all  data  paths 
which  traverse  the  setuid  but  do  not  traverse 
the  authentication  routine.  The  specification  of 
setuid  indicates  that  paths  which  traverse  both 
are  secure,  and  thus  do  not  need  further  test¬ 
ing.  Determining  slices  in  this  manner  is  called 
dicing  [LC91]. 

The  current  version  of  the  Tester’s  Assis¬ 
tant  slicer  does  a  data-flow  breakdown  of  the 
program,  and  can  find  simple  slices.  Signifi¬ 
cantly,  interprocedural  and  pointer  analysis  are 
not  complete,  so  slices  do  not  cross  procedural 
boundaries  and  have  to  make  limiting  assump¬ 
tions  about  pointer  behavior.  The  primary  cost 
in  slicing  is  producing  the  data-flow  representa¬ 
tion.  The  largest  example  to  which  the  slicer  has 
been  applied  is  the  rdist  server,  which  required 
over  three  thousand  data-flow  nodes  and  took 
between  seven  and  eight  seconds  to  execute. 

4  Execution  Monitoring 

In  this  section  we  present  our  approach  to  moni¬ 
toring  the  results  of  test  runs  to  determine  if  the 
results  are  as  expected.  We  show  that  for  the 
testing  of  trusted  programs  in  a  UNIX  environ¬ 
ment  (i.e.,  programs  that  typically  run  outside  of 
the  kernel,  but  use  kernel  services),  the  kernel, 
through  its  audit  services,  provides  data  that  is 
useful  in  the  analysis  of  test  runs. 

A  test  run  of  a  program  needs  an  oracle  that 
indicates  whether  or  not  the  execution  produced 
the  correct  results.  Manual  inspection  of  the  out¬ 


put  of  an  execution  is  infeasible  in  many  situa¬ 
tions,  and  at  best  error-prone,  especially  if  cor¬ 
rect  behavior  must  satisfy  numerous  security  re¬ 
quirements.  An  executable  oracle  derived  from 
the  security  requirements  is  desirable. 

The  use  of  specifications  as  oracles  to  test 
programs  is  not  new:  Richardson  [OT89]  and 
Sankar  [SH94]  [San89]  used  specifications  to 
generate  assertion-checking  functions.  What  is 
unique  about  our  approach  is  the  relative  (small) 
size  of  the  specifications,  and  the  ability  to  as¬ 
sociate  them  with  a  slice  of  the  program.  For 
certain  programs,  in  addition  to  adding  predi¬ 
cate  assertions,  we  track  an  abstract  “state”  of 
the  program  (for  instance,  the  current  uid). 

Testing  programs  with  respect  to  security 
specifications  has  a  distinctive  characteristic  as 
compared  with  the  testing  with  respect  to  gen¬ 
eral  properties.  Most  security  specifications  cap¬ 
ture  “safety”  properties  -  i.e.,  bad  things  should 
not  happen.  In  current  operating  systems,  the 
state  can  be  changed  only  through  invocation  of 
system  calls  to  the  kernel.  The  logs  of  all  system 
calls  made  by  the  program  contains  data  perti¬ 
nent  to  the  results  of  the  test.  Therefore,  system 
audit  trails,  which  record  all  the  system  calls  to 
the  kernel,  can  be  used  as  a  basis  for  checking 
whether  a  program  conforms  to  its  security  spec¬ 
ifications. 

A  significant  advantage  of  using  auditing  for 
testing  programs  is  that  auditing  is  a  service 
available  in  most  current  operating  systems  (e.g., 
Sun  Solaris).  For  many  utility  programs,  we  can 
directly  check  the  results  of  a  test  without  instru¬ 
menting  the  program,  thus  obviating  the  need  to 
insert  assertion-checking  functions  into  the  pro¬ 
gram  source  code.  Moreover,  auditing  is  done 
outside  of  the  tested  program,  guaranteeing  that 
neither  the  program  nor  its  test  data/results  are 
tampered  with.  Auditing  has  been  used  for  in¬ 
trusion  detection  systems,  where  audit  trails  are 
analyzed  in  real  time  to  detect  attacks  on  com¬ 
puter  systems. 

In  our  scheme,  the  program  in  question  is  in¬ 
voked  by  the  test  driver  with  appropriate  test 
input.  (See  Figure  3.)  With  auditing  enabled 
properly,  actions  of  the  program  are  recorded  as 


fail 


succeed 


\  / 
\  / 
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audit  trails  and  saved  into  tiles3. 

The  audit  trails,  which  consist  of  a  sequence 
of  audit  records  arranged  in  temporal  order,  are 
filtered  and  preprocessed  by  the  run-time  asser¬ 
tion  checker  to  yield  records  associated  with  the 
execution  of  the  program  by  the  run-time  asser¬ 
tion  checker.  The  checker  then  checks  the  results 
of  the  test  by  matching  the  audit  records  against 
audit-trail  rules,  which  are  derived  from  the  se¬ 
curity  specifications  describe  in  Section  2. 

In  Sun  BSM  audit  trails,  an  audit  record  sig¬ 
nals  an  occurrence  of  an  event,  which  is  identi¬ 
fied  by  the  event  ID  in  the  record.  In  general,  an 
action/operation  referred  to  the  security  specifi¬ 
cation  may  correspond  to  one  or  more  events. 
For  instance,  the  following  events  all  indicate 
a  write  to  a  hie:  AUD_0PEN_W,  AUD_OPEN_RW, 
AUD_OPEN_TW ,  AUD_OPEN_TRW.  Therefore,  a  reg¬ 
ular  expression  of  events  is  used  to  represent  a 
high  level  operation  in  the  security  specification. 
If  the  acceptability  of  an  action  with  respect  to 

°In  Sun  OS,  auditing  is  carried  out  by  an  audit  dae¬ 
mon,  which  collects  events  dehned  by  an  audit  specifica¬ 
tion  and  records  then  as  audit  trails. 


a  security  requirement  depends  on  the  current 
state  of  the  program  (e.g.,  authentication  must 
precede  the  setuid  call),  state  changes  associated 
with  program  variables  from  which  state  changes 
of  the  program  can  be  inferred  are  also  included 
in  the  audit-trail  results. 

Even  large  programs  have  unexpectedly  sim¬ 
ple  behavior  with  respect  to  security,  and  offer 
conformance  to  security  properties  which  can  be 
checked  from  audit  trails.  One  example  is  the 
rdist  program,  which  has  a  vulnerability  that 
enables  a  user  to  change  the  permission  bits  of 
any  hie.  However,  from  a  security  perspective, 
it  is  clear  that  rdist  should  not  change  the  per¬ 
missions  of  any  hies  not  owned  by  the  process 
which  calls  rdist,  a  property  that  is  easily  stated 
and  checked  at  runtime  just  by  analysis  of  audit 
trails.  Another  example  is  the  infamous  send- 
mail  program,  which  has  been  the  source  of  many 
security  problems4.  The  basic  function  of  Send- 

4One  of  the  vulnerabilities,  employed  by  the  internet 
worm,  enables  any  users  in  a  remote  host  to  obtain  a  root 
shell  in  the  host  running  sendmail.  The  most  recently 
discovered  vulnerability  enables  an  attacker  to  cause  root 


mail  is  to  deliver  mail  messages  to  users.  In  a 
networked  environment,  it  also  has  to  route  mail 
to  destination  hosts.  Sendmail  should  only  have 
append  access  to  a  user’s  mailbox  hie,  and  the 
abihty  to  execute  a  program  on  behalf  of  a  user 
when  a  mail  message  arrives  (e.g.,  vacation  pro¬ 
grams).  It  should  not  write  to  other  hies  (e.g., 
the  password  hie,  system  programs).  Again,  it 
is  easy  to  check  whether  sendmail  exceeds  the 
expected  behavior  from  the  audit  trails,  thus  en¬ 
abling  security  testing. 

On  the  other  hand,  using  auditing  as  a  basis 
for  testing  programs  has  limitations.  Most  cur¬ 
rent  auditing  systems  do  not  record  all  parame¬ 
ters  of  system  calls;  hence,  not  all  the  informa¬ 
tion  about  an  event  is  recorded.  For  instance, 
how  a  program  modihes  a  hie  cannot  be  inferred 
from  the  audit  trails.  In  addition,  some  specifi¬ 
cations,  like  authentications,  are  hard  to  check 
in  this  manner. 

Fortunately,  since  in  the  course  of  slicing  we 
are  parsing  and  regenerating  the  source  of  the 
program,  it  is  simple  to  add  code  to  the  pro¬ 
gram  for  the  purpose  of  monitoring  the  adher¬ 
ence  to  specihcations.  Additionally,  information 
from  the  slice  can  be  used  to  insert  monitoring 
code  at  exactly  those  locations  which  can  influ¬ 
ence  the  specihcation  assertion. 

5  Examples 

To  apply  the  methodology  of  property-based 
testing,  we  take  as  examples  UNIX  system 
programs,  and  consider  security-based  proper¬ 
ties.  Security  properties  are  easily  isolatable 
from  other  program  properties,  and  in  general, 
that  portion  of  the  program  which  deals  with 
security-related  objects  (e.g.,  hlenames  and  user 
ids)  is  a  signihcantly  smaller  subset,  so  slicing 
is  particularly  useful.  Picking  system  programs 
provide  us  with  a  large  example  suite,  many  of 
which  have  already  exhibited  flaws.  We  can  both 
compare  the  ability  of  the  Tester’s  Assistant  in 
detecting  these  known  flaws  and  also  test  the  sys¬ 
tem’s  ability  to  find  previously  unknown  flaws. 

to  execute  any  program  he  wants,  perhaps  a  program  with 
a  Trojan  Horse. 


Finally,  these  examples  provide  a  fairly  represen¬ 
tative  sample  of  the  range  of  specihcations  and 
programs  relevant  to  computer  security. 

To  illustrate,  Figure  4  is  a  slice  of  the 
MINIX  [Tan87]  login  program  with  respect  to 
the  setuid  system  call.  The  original  program 
contains  337  lines,  the  slice  only  20,  demonstrat¬ 
ing  the  effectiveness  of  slicing  in  this  case5. 

The  mapping  of  the  abstract  concept  of  au¬ 
thentication  to  source  code  in  the  MINIX  login 
program  is  an  example  of  the  difficulty  in  general 
of  mapping  specihcations  to  appropriate  slicing 
criteria.  While  this  slice  can  be  produced  by  slic¬ 
ing  the  code  with  respect  to  the  setuid  system 
call,  the  additional  information  in  the  specihca¬ 
tion  that  deals  with  authentication  is  necessary 
in  order  to  accurately  produce  an  oracle  that 
checks  the  adherence  to  the  setuid  specihcation. 

Two  other  interesting  examples  of  property- 
based  testing  are  the  UNIX  utility  rdist,  and 
the  UNIX  network  service  fingerd.  Rdist  was 
found  to  have  a  haw  that  would  allow  a  user  to 
alter  the  permissions  on  any  hie  in  the  system. 
Although  rdist  is  a  very  large  program,  its  haw 
can  be  detected  with  a  simple  security  require¬ 
ment:  that  no  hie  be  altered  unless  its  hie  name 
is  entered  as  input  to  the  program. 

Fingerd  was  the  cause  of  many  security 
breaches.  The  haw  that  caused  the  security 
breach  was  found  to  be  a  string  overhow  error. 
This  is  a  violation  of  a  well-behavedness  prop¬ 
erty,  which  will  be  found  by  testing  for  well- 
behavedness  requirements. 

6  Architecture  of  the  Tester’s 
Assistant 

The  hve  components  of  the  Tester’s  Assistant 
are  the  specihcation  language,  the  slicer,  the 
datahow  coverage  instrumenter,  the  execution 
monitor,  and  the  test  data  generator.  A  human 
tester  would  use  the  system  by  selecting  spec¬ 
ihcations  to  test  against,  and  some  initial  test 
data  for  the  program  to  be  tested.  The  execu¬ 
tion  monitor  detects  when  an  incorrect  execution 

5Line  counts  generated  using  the  UNIX  wc  command. 
The  loop  exit,  a  call  to  exec(),  is  not  shown. 


for  ( ; ; )  { 

bad  =  0; 

write(l,  "login:  ",  7); 
n  =  read(0,  name,  30); 

if  ((pwd  =  getpwnam(name) )  ==  NULL)  bad++; 

if  (bad  ||  strlen(pwd->pw_passwd)  !=  0)  { 
write(l,  "Password:  ",  10); 

n  =  read(0,  password,  30); 

if  (bad  &&  crypt (password,  "aaaa")  | | 

strcmp(pwd->pw_passwd,  crypt (password,  pwd->pw_passwd) ) )  { 
write(l,  "Login  incorrect\n" ,  16); 
continue ; 

> 

> 

setuid(pwd->pw_uid) ; 

> 


Figure  4:  Slice  of  MINIX  login  with  respect  to  setuid(). 


occurs.  If  no  incorrect  execution  occurs,  slices 
of  the  program  are  examined  for  their  dataflow 
coverage  information,  and  more  test  data  is  gen¬ 
erated  as  a  result  of  this  analysis. 

All  of  the  components  coordinate  through  a 
common  data-flow  program  representation.  To 
generate  the  dataflow  representation  and  correct 
the  instrumentation  necessary  for  coverage  anal¬ 
ysis  and  monitoring,  the  ELI  [W+93]  system  is 
used.  ELI  is  a  text  processing  and  compiler  con¬ 
struction  toolkit.  In  ELI,  high-level  specifica¬ 
tions  are  converted  into  high-performance  exe¬ 
cutable  translators  and  compilers.  Using  ELI, 
we  are  able  to  quickly  prototype  the  Tester’s  As¬ 
sistant. 

The  dataflow  representation  has  been  imple¬ 
mented.  The  slicer  is  partially  implemented: 
it  is  operational  on  single-procedure  programs 
using  most  of  C’s  operations  and  properties. 
Inter-procedural  slices  generated  by  algorithms 


from  [LC92]  will  greatly  increase  the  number  of 
programs  we  can  analyze.  Pointer  anti-aliasing 
will  be  introduced  to  make  slices  more  efficient. 
Currently,  the  slicer  makes  worst  case  assump¬ 
tions  about  pointer  aliasing,  though  the  well- 
behavedness  assumption  limits  the  scope  of  the 
aliasing  assumptions.  Library  and  system  calls 
are  being  specified  as  necessary.  Slicing  criteria 
are  currently  limited  to  a  single  variable  and/or 
system  call.  Technical  tasks  for  the  short  term, 
technical  tasks  are  to  improve  the  slicing  algo¬ 
rithm  and  to  provide  a  richer  language  for  ex¬ 
pressing  slicing  criteria.  A  harder  problem  to  be 
addressed  is  to  produce  templates  or  other  de¬ 
vices  for  automatically  associating  complex  con¬ 
cepts  in  the  specification  (like  authentication)  to 
source  code. 


7  Discussion 


We  are  investigating  the  technique  of  property- 
based  testing,  whereby  specifications  are  used 
to  slice  a  program  to  an  executable  subset  rel¬ 
evant  to  the  specification.  Traditional  methods 
are  used  to  derive  test  data  for  the  slice.  The 
specifications  are  reused  as  an  oracle,  when  the 
data  is  applied  to  the  slice.  A  testing  environ¬ 
ment,  the  Tester’s  Assistant,  is  being  developed 
to  evaluate  the  effectiveness  of  property-based 
testing  for  C  programs. 

Our  initial  application  has  been  UNIX  utility 
programs,  a  major  source  of  security  problems 
in  UNIX.  Although  we  have  not  captured  all 
security-relevant  behavior  of  UNIX  in  our  spec¬ 
ifications,  the  specifications  we  have  written  to 
represent  integrity  requirements  have  been  sur¬ 
prisingly  easy  to  produce  and  are  very  concise 
-  typically  just  a  few  lines  of  predicate  logic 
with  respect  to  security  properties.  The  reduc¬ 
tion  in  program  size  due  to  slicing  has  proved 
to  be  substantial.  With  the  use  of  system  audit 
trails  in  execution  monitoring,  results  of  many 
test  runs  can  be  analyzed  without  instrumenting 
the  tested  program,  which  simplifies  the  analysis 
of  test  data. 

At  the  current  moment,  work  is  continuing  on 
the  development  of  the  Tester’s  Assistant.  We 
are  broadening  the  set  of  examples  the  Tester’s 
Assistant  can  and  will  be  applied  to,  includ¬ 
ing,  in  the  farther  future,  extending  the  slicer 
to  handle  concurrent  constructs.  In  the  mean¬ 
time,  we  are  also  studying  other  applications 
(e.g.,  safety  properties)  where  the  specifications 
might  be  easy  to  write  and  the  benefits  of  slicing 
are  substantial. 
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